DOC: Add missing docstrings #31047

Merged

Conversation

galuhsahid
Contributor

@galuhsahid galuhsahid commented Jan 15, 2020

Added docstrings:

pandas.Index.has_duplicates
pandas.Index.is_all_dates
pandas.Index.name
pandas.Index.is_boolean
pandas.Index.is_floating
pandas.Index.is_integer
pandas.Index.is_interval
pandas.Index.is_mixed
pandas.Index.is_numeric
pandas.Index.is_object

There are some other docstrings left - I couldn't find the docstrings below in the code:

None:None:GL08:pandas.Index.names:The object does not have a docstring
None:None:GL08:pandas.Index.empty:The object does not have a docstring

I don't really understand what the function does so I'm leaving the docstring empty for now:

pandas/pandas/core/indexes/base.py:639:GL08:pandas.Index.view:The object does not have a docstring

cc @datapythonista

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Member

@datapythonista datapythonista left a comment

Great job! I added a few comments, but this looks really great. Thanks!


Returns
-------
boolean
Member


Returns
-------
boolean
Member

Suggested change
boolean
bool


Returns
-------
boolean
Member

Suggested change
boolean
bool

return self.inferred_type in ["integer"]

def is_floating(self) -> bool:
"""
Check if the Index only consists of floats, NaNs, or
a mix of floats, integers, or NaNs.
Member

First line should be a single line. More information can be added later, in other paragraphs. This is for the index pages, to display correctly.

https://pandas.io/docs/development/contributing_docstring.html#section-1-short-summary
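As an illustration of the short-summary rule described above, here is a minimal sketch in plain Python. The function `is_floating_example`, its wording, and the helper `short_summary` are hypothetical and are not the docstring that was merged; the sketch only shows a one-line summary followed by a separate paragraph, and how a tool could pick up that first line.

```python
import inspect


def is_floating_example(values) -> bool:
    """
    Check if all values are floats.

    The extended description goes in its own paragraph, after a blank
    line. NaN counts as a float here.

    Returns
    -------
    bool
        Whether or not every value is a float.
    """
    # Hypothetical stand-in for illustration; not the pandas implementation.
    return all(isinstance(v, float) for v in values)


def short_summary(obj) -> str:
    """Return the first line of the cleaned docstring."""
    return inspect.getdoc(obj).splitlines()[0]


print(short_summary(is_floating_example))
```

Because the summary fits on one line, an index page that shows only the first docstring line gets a complete sentence.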


Returns
-------
boolean
Member

Suggested change
boolean
bool


Returns
-------
boolean
Member

Suggested change
boolean
bool


Returns
-------
boolean
Member

Suggested change
boolean
bool


Returns
-------
boolean
Member

Suggested change
boolean
bool


Returns
-------
boolean
Member

Suggested change
boolean
bool

Member

@datapythonista datapythonista left a comment

Thanks @galuhsahid, great job!

Member

@jorisvandenbossche jorisvandenbossche left a comment

Thanks a lot for those docstrings!!

Added a few comments / questions

Returns
-------
bool
Whether or not the Index has duplicate values.
Member

For a case like this, this might be a bit duplicative with the first line.

Do we (or the validation script) always require an explanation of the return type?

Contributor Author

Yes, the validation script requires a description for return values. Related error code: 'RT03': 'Return value has no description'. I confirmed this by removing one of the explanations, leaving only the return type; the error appeared when I ran python3 scripts/validate_docstrings.py --errors=RT03
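To make the RT03 rule concrete, the following is a rough sketch of the kind of check involved. It is not the logic of scripts/validate_docstrings.py; the helper `returns_has_description` and both sample functions are hypothetical, and the sketch assumes a numpydoc-style "Returns" section with the type on the first line and an indented description below it.

```python
import inspect


def returns_has_description(obj) -> bool:
    """Report whether the Returns section has an indented description."""
    lines = inspect.getdoc(obj).splitlines()
    for i, line in enumerate(lines):
        # A numpydoc section is a title followed by a dashed underline.
        is_title = line.strip() == "Returns"
        has_underline = i + 1 < len(lines) and set(lines[i + 1].strip()) == {"-"}
        if is_title and has_underline:
            body = lines[i + 2:]
            # body[0] is the return type; an indented, non-empty line
            # right after it is treated as the description.
            return len(body) > 1 and body[1].startswith(" ") and bool(body[1].strip())
    return False


def with_description():
    """
    Check something.

    Returns
    -------
    bool
        Whether or not the check passed.
    """


def without_description():
    """
    Check something.

    Returns
    -------
    bool
    """


print(returns_has_description(with_description))
print(returns_has_description(without_description))
```

Under these assumptions, the second function is the shape that would be reported as having no return description.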

Member

(this discussion is certainly not a blocker for this PR, to be clear)

@datapythonista what's your view on this? It's of course easiest to be consistent / have a clear rule in the validation. But personally, I find that it doesn't add any value in this specific case.

Member

I agree it's probably a bit repetitive. I think it may add value: even if most people can easily infer the output (what True and False mean) from the short summary and the name of the function, I guess beginners can appreciate having it explicit. It's sometimes difficult to know whether what is obvious to us is obvious to other people.

In any case, even assuming it literally doesn't add any value, with all the work we've got with docstrings, I would simply move forward, since there are so many other things that I think are more important and more worth our time. This looks fine to me, even if the repetition is not ideal.

Member

Duplicate content can also add noise, as you might need to read both to ensure you don't miss something.

Anyway, not a discussion to continue on this PR

is_interval : Check if the Index holds Interval objects.
is_mixed : Check if the Index holds data with mixed data types.

Returns
Member

Nit: "Returns" section above the "See also"

>>> idx.is_integer()
False

>>> idx = pd.Index([1, 2, 3, 4.0])
Member

In practice, this is the same as above, as this will be parsed into a Float64Index. So if we want a third example, I would maybe rather show strings (or just leave it out)

>>> idx.is_floating()
True

>>> idx = pd.Index([1, 2, 3, 4.0, np.nan])
Member

Same comment here as above.

I understand that from looking at the lists that are used to create the index, it looks like different cases, but all those are Float64Index objects. So for this case, I find it makes it actually more confusing (it would be rather an example to show in the main Index docstring to illustrate the constructor).

Thoughts?

Showing that it can contain NaN is of course useful.

Contributor Author

I see; I was thinking that by showing it, people who are not familiar with Float64Index objects will know what to expect, but I do agree that it's probably better as an example in the main Index docstring. Will modify it & leave the NaNs
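The point made in this thread can be checked with a short sketch, assuming pandas and NumPy are installed. The variable names are illustrative only. It relies on `Index.dtype` and `Index.inferred_type`; note that in newer pandas releases the object is a plain `Index` with dtype float64 rather than a `Float64Index` class.

```python
import numpy as np
import pandas as pd

# A single float in the list is enough for the whole Index to be float64.
mixed = pd.Index([1, 2, 3, 4.0])
# NaN is itself a float, so it does not change the dtype either.
with_nan = pd.Index([1.0, 2.0, np.nan])
# Only integers: the Index keeps an integer dtype.
ints = pd.Index([1, 2, 3, 4])

print(mixed.dtype)     # float64
print(with_nan.dtype)  # float64
print(mixed.inferred_type, with_nan.inferred_type, ints.inferred_type)
```

So the lists `[1, 2, 3, 4.0]` and `[1.0, 2.0, 3.0, 4.0]` look different but produce the same kind of index, which is why a separate example for each adds little to an `is_floating` docstring.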

-------
bool
Whether or not the Index only consists of numeric
data.
Member

Suggested change
data.
data.

@datapythonista
Member

If you have a look at the CI, there are a couple of errors in the docstrings that are making it fail: https://github.com/pandas-dev/pandas/pull/31047/checks?check_run_id=391576648#step:10:24

True

>>> idx = pd.Index([1, 2, 3, 4])
>>> idx.is_integer()
Member

Suggested change
>>> idx.is_integer()
>>> idx.is_floating()

I think this is a typo

@WillAyd WillAyd added this to the 1.1 milestone Jan 16, 2020
@jorisvandenbossche jorisvandenbossche merged commit a72eef5 into pandas-dev:master Jan 17, 2020
@jorisvandenbossche
Member

@galuhsahid Thanks a lot!
